Exchange Server 2010: Troubleshooting Tools (Part 1)

11/17/2010 7:49:19 PM

Even a well-designed and operated Exchange system will eventually experience problems that you need to identify and repair. The previous section explained the troubleshooting methodology. This section highlights some of the top tools available for troubleshooting.

1. Identifying and Resolving Performance Problems

The Exchange Management Console (EMC) ships with a collection of troubleshooting and diagnostic tools. Some of the tools are hosted within the EMC; others, such as the Best Practices Analyzer, are separate executables that are either launched from the Toolbox or must be installed separately.

This section discusses a few of the top tools in more detail. You can also use a number of additional tools. Table 1 lists these tools and their functions.

Table 1. Troubleshooting Tools
TOOL	DESCRIPTION
DNSLint	DNSLint can be used to help diagnose common DNS configuration errors across multiple DNS servers. DNSLint is useful for identifying connectivity issues.
Error Code Lookup (Err.exe)	Looks up the meaning of decimal and hexadecimal error codes. Use it whenever an event or log entry reports a raw error code.
Event Viewer (eventvwr.msc)	An MMC snap-in for viewing logged events. Event Viewer is a good starting point for almost any investigation.
LDP (ldp.exe)	A GUI tool used to perform LDAP operations (connect, bind, search, modify, add, delete) against Active Directory or any LDAP-compatible directory. LDP can display all property information about objects in the directory. LDP can help identify user configuration issues.
Process Monitor	A monitoring tool for Windows that shows real-time information from the file system, registry, and process/threads. Process Monitor is helpful for diagnosing performance issues.
Microsoft Product Support Reports	The Microsoft Product Support Reports tool gathers critical system and logging information useful for troubleshooting support issues. This tool is useful for general data gathering. Under most circumstances Microsoft PSS will send the customer a link to run the tool.

1.1. Microsoft Exchange Best Practices Analyzer

The Microsoft Exchange Best Practices Analyzer (ExBPA) is a tool to help administrators assess the health of their servers and topology. The tool scans the live environment and compares the results against a vast list of best practices defined by Microsoft. It pulls information from Active Directory, the registry, WMI, the IIS metabase, and Performance Monitor. Additionally, it collects useful information about the Exchange organization. Because it is a stand-alone tool, it has its own help file and is not integrated into the standard Exchange help. You have a lot of flexibility when running the tool. In most cases ExBPA will also display information on how to correct the identified issue. You can scan the entire Exchange organization or scope it down to a single server. Several types of scans are available, as shown in Table 2.

Table 2. ExBPA Scan Types
SCAN TYPE	SCAN ACTIONS	WHEN TO USE
Health Check	Performs a full scan checking for errors, warnings, non-default configurations, and configuration changes. Optionally, it can take samples of performance counters over a two-hour period.	Use this scan type to check the health of the organization or to troubleshoot a specific problem.
Permissions Check	Scans Active Directory domain naming context and the Exchange configuration naming context.	Run this scan if you suspect a permissions access issue.
Connectivity Check	This scan tries to validate all network connectivity and Active Directory access.	This scan helps troubleshoot network connectivity issues. It can be very useful if you have firewalls in the topology.
Baseline	The Baseline check allows an administrator to set threshold values that will be checked against the server's actual configuration.	Useful to report on deviations from baseline values.

One interesting way to use ExBPA is to compare information between multiple runs to see what configuration items have changed. If an issue arises, it is easy to compare a new report against a known good baseline and quickly spot any differences.

1.2. Microsoft Exchange Performance Troubleshooter

The Microsoft Exchange Performance Troubleshooter (ExPTA) helps locate performance-related issues. As you can see in Figure 1, an administrator selects the type of performance symptom being observed. RPC issues generally affect client performance, so they are a common area that administrators need to troubleshoot.

Figure 1. ExPTA

The ExPTA collects configuration information and live performance counter information to analyze each subsystem to determine bottlenecks that can affect RPC calls. For example, it collects disk, memory, LDAP, and event viewer data in its analysis. Like the ExBPA tool, the ExPTA also makes recommendations on how to correct any issues it identifies. Keep in mind that it is best to run this tool while experiencing performance issues. Although you can check all of the performance counters manually, this tool greatly speeds up the troubleshooting process.

1.3. Exchange Profile Analyzer

The Exchange Profile Analyzer (ExPA) collects statistical information from one or more mailbox databases across the Exchange organization. The reports include detailed information such as the average message size, mailbox sizes, message counts, and recipient information. This tool mainly helps with capacity planning. For example, when planning an Exchange Server 2010 Mailbox server, the tool can gather information that can later be used to complete the Exchange Mailbox Role Calculator spreadsheet.

Also included in the installation package is the OWA Profile Analyzer. This tool reports on information such as logon/logoffs and detailed mail operations. Again, this is useful for reporting on trending information and capacity planning. Instead of guessing how many users actually use OWA, you can get solid numbers for reporting.

1.4. Client-Side Issues

Service Pack 1 introduces a new cmdlet for fixing mailbox issues, New-MailboxRepairRequest. This cmdlet can be used to detect and fix the following types of mailbox corruption:

Search folder corruptions
Aggregate counts on folders not reflecting correct values
Views on folders not returning correct contents
Provisioned folders incorrectly pointing into unprovisioned parents or vice versa

When running this task with the database online, mailbox access will be disrupted only for the mailbox that is being repaired. All other mailboxes on the database or server will still be operational.
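As a rough sketch of how such a repair might be started from the Exchange Management Shell: in the released Service Pack 1 the repair cmdlet is New-MailboxRepairRequest. The mailbox address below is a placeholder, and the first call uses -DetectOnly so that corruption is reported without being fixed.

```powershell
# Placeholder mailbox identity; substitute a real mailbox.
$mailbox = "user@contoso.com"

# Report-only pass: check all four corruption types without changing anything.
New-MailboxRepairRequest -Mailbox $mailbox `
    -CorruptionType SearchFolder,AggregateCounts,FolderView,ProvisionedFolder `
    -DetectOnly

# Once the findings look reasonable, run again without -DetectOnly to repair.
New-MailboxRepairRequest -Mailbox $mailbox `
    -CorruptionType SearchFolder,AggregateCounts,FolderView,ProvisionedFolder
```

Progress and results are reported in the Application event log on the Mailbox server, so Event Viewer is the place to confirm what was found.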

Another useful pair of troubleshooting tools is built right into Outlook 2007 and 2010. The first is the Connection Status dialog box, which shows the client's current connection status and useful information such as the response time and request failures. Occasionally when the network status changes, Outlook fails to reconnect automatically. This sometimes occurs when switching between a wired and wireless connection or enabling a connection to a corporate VPN. If Outlook does not automatically reconnect, a user can click the Reconnect button to force Outlook to attempt to restore the client's server connection. The second tool is the Test E-mail AutoConfiguration tool. This is useful for diagnosing issues with AutoDiscover or the services that AutoDiscover returns.

2. Identifying and Resolving Mail Flow Issues

A number of tools are available to help you troubleshoot mail flow issues. The queue viewer, mail flow troubleshooter, and transport logs all help identify the problem.

2.1. Transport Logs

Sometimes the administrator needs detailed information to troubleshoot mail flow issues. Fortunately, Exchange provides multiple ways to get the behind-the-scenes information needed to find the root cause. Table 3 summarizes the different logs available within Exchange.

Table 3. Transport Log Summary
LOG	DETAIL	USED FOR
Connectivity Logs	Records connection activity of outbound message delivery queues.	Troubleshooting problems with messages reaching their destination Mailbox server, Send connector, or domain.
Protocol Logs	Tracks SMTP communication between Exchange servers as part of message routing and delivery. Other protocols can be enabled, such as POP, IMAP, and HTTP.	Troubleshooting message delivery from Send and Receive connectors.
Routing Table Log	A snapshot of the message routing table used by the Hub Transport or Edge Servers.	Troubleshooting internal message delivery.
Message Tracking Logs	Track the flow of messages between servers.	Troubleshooting message delivery and determining the status of a message.
Agent Logs	Records actions performed on messages by specific anti-spam agents on Edge or Hub Transport servers.	Troubleshooting messages that have been acted upon by anti-spam agents.

2.1.1. Connectivity Logs

The connectivity logs record the connections that Hub Transport and Edge Transport servers make to their destination servers. The detailed connection information in the log is helpful when troubleshooting outbound mail flow issues. The connectivity logs do not contain message content, only information about the connections used to deliver mail.
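A minimal sketch of checking and enabling connectivity logging from the Exchange Management Shell follows. The server name HUB01 is a placeholder, not a value taken from your environment.

```powershell
# HUB01 is a placeholder server name.
# Show the current connectivity log settings (enabled state, path, age and size limits).
Get-TransportServer HUB01 | Format-List ConnectivityLog*

# Turn connectivity logging on for the duration of the investigation.
Set-TransportServer HUB01 -ConnectivityLogEnabled $true
```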

2.1.2. Protocol Logs

The protocol logs are disabled by default, and can be enabled or disabled on a per-connector basis. The logging level is configured on each connector, whereas settings such as the log location and size limits are set on the server and apply to all of its connectors.

Protocol logs should only be enabled when you are troubleshooting because they can impact the performance of the server. By default, the logs will only consume 250 MB of disk space, but this may need to be increased because the logs can grow quickly. Microsoft IT notes that they capture between 5 and 15 GB of protocol logs per day on their Edge servers.
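The following sketch shows one way to switch protocol logging on for a troubleshooting session and off again afterward. The connector and server names are placeholders, and the 1 GB cap is only an example value, not a recommendation.

```powershell
# Connector and server names are placeholders.
# Enable verbose SMTP protocol logging on one Receive and one Send connector.
Set-ReceiveConnector "HUB01\Default HUB01" -ProtocolLoggingLevel Verbose
Set-SendConnector "Internet Mail" -ProtocolLoggingLevel Verbose

# Raise the per-server directory caps if the default fills too quickly.
Set-TransportServer HUB01 -ReceiveProtocolLogMaxDirectorySize 1GB `
    -SendProtocolLogMaxDirectorySize 1GB

# When troubleshooting is finished, switch logging off again.
Set-ReceiveConnector "HUB01\Default HUB01" -ProtocolLoggingLevel None
Set-SendConnector "Internet Mail" -ProtocolLoggingLevel None
```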

2.1.3. Routing Table Log

The Routing Table Log is enabled by default. The table is recalculated and logged after a routing change or every 12 hours by default. The Microsoft Exchange Transport Service is responsible for this log, and runs on every Hub Transport or Edge server. The Routing Table Log viewer is located in the EMC toolbox, and can be used to read local or remote logs. The routing log can be used to validate Active Directory's configuration information. Within the log is information on site and routing groups, servers, Send connectors, and address spaces.

2.1.4. Agent Logs

The agent logs can be useful for an administrator who wants to understand why an action was taken on a message by the anti-spam agents running on the Edge or Hub Transport servers. The following agents can write information to this log:

Connection Filter Agent
Content Filter Agent
Edge Rules Agent
Recipient Filter Agent
Sender Filter Agent
Sender ID Agent

The information written to the log depends on which agent acted on the message and which action it performed.
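As an illustration, the agent log on the local transport server can be read with the Get-AgentLog cmdlet. The dates and the choice of agent below are example values only.

```powershell
# Dates and the agent name are example values.
# Show what the Content Filter Agent did to messages during one day.
Get-AgentLog -StartDate "11/16/2010" -EndDate "11/17/2010" |
    Where-Object { $_.Agent -eq "Content Filter Agent" } |
    Format-Table Timestamp,P1FromAddress,Recipients,Action,Reason -AutoSize
```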

2.1.5. Message Tracking Logs

A very common task for administrators is tracking down message delivery. Users report that they sent a message that was never received, or that a message they were expecting never arrived. Exchange 2010 provides a new feature called Delivery Reports that allows users and administrators to easily retrieve transport information about messages. Delivery reports help answer whether and when a message was delivered. What a report shows, and even which reports are available, depends on role-based access security: Table 4 lists each tracked event and the management roles that can see it.

Table 4. Delivery Report Information
EVENT	MANAGEMENT ROLE (ROLE GROUP)
E-mail Submission from the Sender's Mailbox	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Group Expansion	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Delivery Success	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Delivery Failure	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Inbox Rules	Message Tracking (Recipient Management); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Transport Rules	Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Message was read (if enabled)	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Hub Transfers	Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Transfer to External Servers	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Transfer to Older Versions	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)
Moderation	MyBaseOptions (Default); Message Tracking (Organization Management); View-Only Recipients (Help Desk)

Users can access delivery reports from OWA by clicking the Options button to open the Exchange Control Panel, clicking the Organize E-Mail tab, and then selecting the Delivery Reports option. Additionally, right-clicking any message in OWA displays the Open Delivery Report option. Administrators can access Delivery Reports from the Exchange Control Panel on the Reporting tab, with PowerShell cmdlets, or within the Exchange Management Console in the Toolbox Message Tracking application. An example of a delivery report is shown in Figure 2.

Figure 2. Delivery report

The Delivery Report tool uses data from the message tracking logs, which by default keep this data for two weeks. Tracking depends on the log data being available at each hop, so configure message tracking log retention consistently on every server a message can pass through. If a mailbox is moved to a different server, message tracking can no longer follow the path of the message and may fail. Thus, Delivery Reports are only available for messages that were generated while the mailbox was on the server where it is currently located. Delivery Report tracking works as illustrated in Figure 3 and described in the following steps.

Figure 3. Message tracking

ECP calls the Search-MessageTrackingReport task with the parameters of the search.
The Search-MessageTrackingReport task locates the sender's Mailbox server.
The Log Search Service on Mailbox Server1 is queried to determine the message's next hop.
The Log Search Service on Hub Transport1 is queried to determine the message's next hop.
Tracking determines that the message crossed the forest/site boundary.
Tracking next contacts Client Access Server2 via EWS in the remote site.
Client Access Server2 queries the Log Search Service on Hub Transport2.
The Log Search Service on Mailbox Server2 is queried.
Delivery status information is returned to Client Access Server2.
Client Access Server2 returns delivery status information to Client Access Server1.
The task merges all of the results and returns them to the user.
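The same search can be started by hand from the Exchange Management Shell. The sketch below assumes an administrator searching on behalf of another user, which is why -BypassDelegateChecking is used; the sender and recipient addresses are placeholders.

```powershell
# Mailbox and recipient addresses are placeholders.
# Step 1: find tracking reports for messages sent from a mailbox to a recipient.
$reports = Search-MessageTrackingReport -Identity "sender@contoso.com" `
    -Recipients "recipient@fabrikam.com" -BypassDelegateChecking

# Step 2: retrieve the delivery summary for each matching message.
$reports | ForEach-Object {
    Get-MessageTrackingReport -Identity $_.MessageTrackingReportId `
        -ReportTemplate Summary -BypassDelegateChecking
}
```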

Service Pack 1 also provides the ability to track a message after it has been queued for delivery. Users and Exchange administrators can now be told when a delay means that a message will not meet its delivery SLA.

2.2. Managing Queues

Queues are a necessary part of transport. Because queues absorb bursts of traffic, organizations do not have to size their transport platform for peak load, which can be costly. For example, if traffic spikes once a month during regular business cycles, it may not make sense to build a platform that is severely underutilized during non-peak periods. Queues also take on responsibility for redelivery when a remote server is not responding. In any case, it is important to monitor the queues for unexpected activity, which may indicate a problem. The Exchange Management Shell (EMS) and Exchange Management Console (EMC) both provide interfaces to view the status and contents of these queues and to perform actions on the messages or queues.

Exchange actually uses several queues during normal mail transport. Like the other troubleshooting tools, the Queue Viewer is located in the EMC in the toolbox. Figure 4 shows the Queue Viewer Console.

Figure 4. Queue Viewer

The Queue Viewer opens the local transport database if one is available, but it can connect to the transport database of any Hub Transport server. On Edge Transport servers, on the other hand, the Queue Viewer can only view the local transport database. The console is fairly basic and has tabs that display the available queues or the messages contained within a queue. Clicking the Create Filter button allows an administrator to view only the queues that match the filter conditions. For example, Figure 5 shows a filter that, when applied, shows only queues in a suspended state. This is very helpful when there are many queues and it is difficult to find the information you are looking for.

Figure 5. Queue Viewer filter
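The same view is available from the Exchange Management Shell with Get-Queue. This is a small sketch in which HUB01 is a placeholder server name.

```powershell
# HUB01 is a placeholder server name.
# Equivalent of the Figure 5 filter: only queues in the Suspended state.
Get-Queue -Server HUB01 -Filter { Status -eq "Suspended" }

# Busiest queues first, which often points at the destination with a problem.
Get-Queue -Server HUB01 |
    Sort-Object MessageCount -Descending |
    Format-Table Identity,DeliveryType,Status,MessageCount,NextHopDomain -AutoSize
```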

Selecting a queue in the main window creates another tab where filters can be applied. You can also apply filters to the Messages tab. Messages can be filtered on properties such as the subject, the source IP address, and even the SCL value. The message view also shows the current status of each message. The status can be one of the following:

Active: If in a delivery queue, the message is being delivered to the next hop. If in the submission queue, the categorizer is processing the message.
Pending Remove: An administrator has removed the message, but it is already in the delivery queue. The message will be deleted if it reenters the queue because of an error, but will be delivered otherwise.
Pending Suspend: An administrator has suspended the message, but it is already in the delivery queue. The message will be suspended if it reenters the queue because of an error, but will be delivered otherwise.
Ready: The message is waiting to be processed.
Retry: The message could not be delivered during the last attempt. Transport will attempt redelivery of the message.
Suspended: The processing of the message has been suspended, and no further action will be taken on the message until an administrator resumes it.
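These status values can also be used as filters in the Exchange Management Shell. The sketch below lists messages waiting for a retry and then suspends, inspects, and resumes mail from one sender domain; the server, queue, and domain names are all placeholders.

```powershell
# Server, queue, and domain names are placeholders.
# List messages that are waiting for a retry on one queue.
Get-Message -Queue "HUB01\contoso.com" -Filter { Status -eq "Retry" }

# Suspend everything from one sender domain, inspect it, then release it.
Suspend-Message -Server HUB01 -Filter { FromAddress -like "*@fabrikam.com" } -Confirm:$false
Get-Message -Server HUB01 -Filter { Status -eq "Suspended" } |
    Format-Table Identity,FromAddress,Subject,SCL,Status -AutoSize
Resume-Message -Server HUB01 -Filter { FromAddress -like "*@fabrikam.com" } -Confirm:$false
```

A message that was already being delivered when it was suspended shows up as Pending Suspend rather than Suspended, so the second command may not list every affected message.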

2.3. Message Latency

A frequent question when tracking messages is "Why did the message take so long to be delivered?" For example, a user reports that a message took over an hour to be delivered. The scripts directory includes a script that converts raw latency information into human-readable form. From within the scripts directory, run the following command:

Get-messagetrackinglog -messageid:"<7590C0B7CDB495033BF129504CE4859002394BCB831210
[email protected]>" | ? {$_.MessageLatencyType -eq 'EndToEnd'} |
ConvertTo-MessageLatency | FT -a ComponentServerFqdn,ComponentCode,ComponentLatency

The preceding command produces the following output:

ComponentServerFqdn  ComponentCode     ComponentLatency
-------------------  -------------     ----------------
MBX01.contoso.com        TOTAL        00:00:01
MBX01.contoso.com        MSSN         00:00:01
HUB01.contoso.com        TOTAL        00:00:09
HUB01.contoso.com        SDS          00:00:05
HUB01.contoso.com        CAT          00:00:01
HUB01.contoso.com        SDD          00:00:01

This shows that the Mailbox and Hub Transport servers together handled the message in about 10 seconds (1 second on MBX01 and 9 seconds on HUB01). In this case, the delay the user reported arose outside of the Exchange organization. This latency data is also exposed for monitoring through the MSExchange Transport Component Latency performance object. The object reports the latency attributed to specific instances, such as the submission queue, delivery queue, and the categorizer, at the fiftieth, eightieth, ninetieth, ninety-fifth, and ninety-ninth percentiles of messages processed over the last 5 minutes. For example, if 99 percent of 100 messages processed in the last 5 minutes had a latency of 50 seconds or less, the Percentile 99 counter for the Total Server Latency instance would be 50.
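To make the percentile arithmetic concrete, here is a small worked example in plain PowerShell. It does not read the real performance counter. Get-LatencyPercentile is a hypothetical helper, and the nearest-rank method it uses is an assumption, since the text does not say how Exchange computes the value.

```powershell
# Hypothetical helper: nearest-rank percentile of a list of latencies (seconds).
function Get-LatencyPercentile {
    param([double[]]$Latency, [int]$Percent)
    $sorted = @($Latency | Sort-Object)
    # Rank of the smallest value that covers $Percent percent of the samples.
    $rank = [int][math]::Ceiling($Percent * $sorted.Count / 100.0)
    $sorted[$rank - 1]
}

# 100 sample messages: 98 take 10 seconds, one takes 50, and one outlier takes 600.
$samples = @(1..98 | ForEach-Object { 10 }) + 50 + 600

Get-LatencyPercentile -Latency $samples -Percent 99   # 50
Get-LatencyPercentile -Latency $samples -Percent 50   # 10
```

The single 600-second outlier does not move the ninety-ninth percentile, which is why percentile counters are more useful than averages for spotting a general slowdown.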

In this example, SMTP servers are inside the organization, but not part of the Exchange organization. It is possible to include non-Exchange servers in the latency calculations. Add the IP range to the InternalSMTPServers property using the Set-TransportConfig cmdlet. The External Servers instance is also included on the perfmon object.
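A minimal sketch of that change follows. The address and the range are placeholders for the non-Exchange SMTP hosts in your organization.

```powershell
# The address and range are placeholders for non-Exchange SMTP hosts.
Set-TransportConfig -InternalSMTPServers @{Add="192.168.1.10","192.168.2.20-192.168.2.30"}

# Confirm the resulting list.
Get-TransportConfig | Format-List InternalSMTPServers
```

Using the Add form appends to the existing list rather than replacing it, which avoids accidentally dropping servers that were configured earlier.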

2.4. Mail Flow Troubleshooter Tool

Another tool available to administrators for troubleshooting mail flow issues is the Mail Flow Troubleshooter. It can be found with the other troubleshooting utilities in the Toolbox in the EMC. The tool can help with a wide variety of scenarios. Figure 6 shows the tool's initial page. Depending on which symptoms you see, the tool requires different information to diagnose the problem automatically. It then presents an analysis of the possible root causes and suggests corrective actions.